Automatic phone set extension with confidence measure for spontaneous speech
Abstract
Extending the phone set is a common approach to dealing with phonetic confusions in spontaneous speech. We propose using a likelihood ratio test as a confidence measure for automatic phone set extension to model phonetic confusions. We first extend the standard phone set using dynamic programming (DP) alignment to cover all possible phonetic confusions in the training data. The likelihood ratio test is then used as a confidence measure to optimize the extended phonetic units, which represent the acoustic samples lying between two highly confusable standard phonetic units. The optimum set of extended phonetic units is combined with the standard phone set to form a multiple-pronunciation dictionary. The effectiveness of this approach is evaluated on spontaneous Mandarin telephony speech, where it yields an encouraging 1.09% absolute reduction in syllable error rate. Using the extended phone set provides a good balance between the demand for a high-resolution acoustic model and the amount of available training data.
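The DP alignment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain Levenshtein-style alignment with unit costs between a canonical phone sequence and an observed (surface) phone sequence, and every name in it (`align_phones`, `confusion_counts`, `keep_extended_unit`, the `threshold` parameter) is hypothetical. The abstract does not specify how the likelihood ratio is computed, so the last function only shows the generic form of such a test, a log-likelihood difference compared against a threshold.

```python
from collections import Counter


def align_phones(canonical, surface, sub_cost=1, indel_cost=1):
    """Align two phone sequences with Levenshtein-style DP (assumed unit costs).

    Returns a list of (canonical_phone, surface_phone) pairs, where None
    marks a deleted or inserted phone.
    """
    n, m = len(canonical), len(surface)
    # cost[i][j] = minimal edit cost of aligning canonical[:i] with surface[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * indel_cost
    for j in range(1, m + 1):
        cost[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 0 if canonical[i - 1] == surface[j - 1] else sub_cost
            cost[i][j] = min(
                cost[i - 1][j - 1] + step,      # match or substitution
                cost[i - 1][j] + indel_cost,    # deletion
                cost[i][j - 1] + indel_cost,    # insertion
            )
    # Backtrace from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = 0 if canonical[i - 1] == surface[j - 1] else sub_cost
            if cost[i][j] == cost[i - 1][j - 1] + step:
                pairs.append((canonical[i - 1], surface[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + indel_cost:
            pairs.append((canonical[i - 1], None))
            i -= 1
        else:
            pairs.append((None, surface[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def confusion_counts(utterances):
    """Count substitution pairs over (canonical, surface) sequence pairs."""
    counts = Counter()
    for canonical, surface in utterances:
        for c, s in align_phones(canonical, surface):
            if c is not None and s is not None and c != s:
                counts[(c, s)] += 1
    return counts


def keep_extended_unit(loglik_extended, loglik_standard, threshold=0.0):
    """Generic log-likelihood ratio check (assumed form, hypothetical threshold)."""
    return (loglik_extended - loglik_standard) > threshold
```

In this sketch, each substitution pair returned by `confusion_counts` would be a candidate extended unit, and `keep_extended_unit` stands in for the confidence-based pruning that the abstract attributes to the likelihood ratio test.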
Similar resources
Automatic speech summarization based on sentence extraction and compaction
This paper proposes a new automatic speech summarization method having two stages: important sentence extraction and sentence compaction. Relatively important sentences are extracted based on the amount of information and the confidence measures of constituent words, and the set of extracted sentences is compressed by our sentence compaction method. The sentence compaction is performed by selec...
Weakly-supervised text-to-speech alignment confidence measure
This work proposes a new confidence measure for evaluating the outputs of text-to-speech alignment systems, which are a key component of many applications, such as semi-automatic corpus anonymization, lip syncing, film dubbing, corpus preparation for speech synthesis, and acoustic model training for speech recognition. This confidence measure exploits deep neural networks that are trained on large corpor...
Pronunciation Modeling for Spontaneous Mandarin Speech Recognition
Pronunciation variations in spontaneous speech can be classified into complete changes and partial changes. A complete change is the replacement of a canonical phoneme by an alternative phone, such as ‘b’ being pronounced as ‘p’. Partial changes are variations within the phoneme, such as nasalization, centralization and voicing. Most current work in pronunciation modeling for spontaneous Man...
Two-stage Automatic Speech Summarization by Sentence Extraction and Compaction
This paper proposes a new automatic speech summarization method having two stages: important sentence extraction and sentence compaction. Relatively important sentences are extracted from the results of large-vocabulary continuous speech recognition (LVCSR) based on the amount of information and the confidence measures of constituent words. The set of extracted sentences is compressed by our se...
Automatic Phonetic Alignment and Its Confidence Measures
In this paper we propose the use of an HMM-based phonetic aligner together with a speech-synthesis-based one to improve the accuracy of the global alignment system. We also present a phone-duration-independent measure to evaluate the accuracy of automatic annotation tools. In the second part of the paper we propose and evaluate some new confidence measures for phonetic annotation.